Design and Analysis of a Dynamic Load Balancing Strategy for Large-Scale Distributed Association Rule Mining

Authors

  • Raja Tlili
  • Yahya Slimani
Abstract

Association rule mining is one of the most important data mining techniques. Its algorithms search a large space, consider numerous alternatives, and scan the data repeatedly. Parallelism seems to be the natural solution for working with industrial-sized databases. Large-scale computing systems, such as Grid computing environments, have recently come to be regarded as promising platforms for data- and computation-intensive applications like data mining. However, improving performance and achieving scalability on these heterogeneous platforms requires new data partitioning approaches and workload balancing features. This paper proposes a dynamic load balancing strategy for parallel association rule mining algorithms in a Grid computing environment. The strategy is built on a distributed model that incurs only small communication overheads for load updates and for both data and work transfers. It also supports the heterogeneity of the system and it is fault

Similar resources

Association rule mining and load balancing strategy in grid systems

Parallel and distributed systems are one of the important solutions proposed to improve the performance of sequential association rule mining algorithms. However, the parallelization and distribution process is not trivial and still faces many problems of synchronization, communication, and workload balancing. Our study is limited to the workload balancing problem. In this paper, ...

A Hierarchical Dynamic Load Balancing Strategy for Distributed Data Mining

Extracting useful knowledge from data sets measuring in gigabytes and even terabytes is a challenging research area for the data mining community. Sequential approaches suffer from a performance problem due to the fact that they have to mine voluminous databases. Parallelism is introduced as an important solution that could improve the response time and the scalability of these approaches. Howe...

A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform

Association rule mining is a time-consuming process because it is both data-intensive and computation-intensive. To mine large volumes of data and to enhance the scalability and performance of existing sequential association rule mining algorithms, parallel and distributed algorithms have been developed. These traditional parallel and distributed algorithms are based on homogeneous ...

A Novel Data Partitioning Approach for Association Rule Mining on Grids

Mining association rules refers to extracting useful knowledge from large databases. Algorithms of this technique are both data- and computation-intensive, which makes grid platforms very attractive for them. However, to exploit these platforms, new data partitioning features are required that take into consideration the specificities of both the association rule mining technique and grids....

Performance Evaluation of the Distributed Association Rule Mining Algorithms

One of the best-known problems in data mining is association rule mining. It requires very large computation and I/O traffic capacity; therefore, several distributed and parallel association rule mining algorithms have been developed. Because the association rule mining problem is NP-complete, estimating the execution time of the algorithms can be very important, especially for load balancing or...

Publication date: 2011